Abstract

While methods of code abstraction and reuse are widespread 
and well researched, methods of proof abstraction and reuse 
are still emerging. We consider the use of dependent types 
for this purpose, introducing a completely mechanical ap- 
proach to proof composition. We show that common tech- 
niques for abstracting algorithms over data structures natu- 
rally translate to abstractions over proofs. We first introduce 
a language composed of a series of smaller language components
tied together by standard techniques from Malcolm [2].
We proceed by giving proofs of type preservation for each 
language component and show that the basic ideas used in 
composing the syntactic data structures can be applied to 
their semantics as well. 

1. Introduction 

The POPLmark challenge is a set of common programming 
language problems meant to test the utility of modern proof 
assistants and techniques for mechanized metatheory. In re- 
sponse to this challenge, significant strides have been made 
in making it easier to mechanize the metatheory of program- 
ming languages, especially regarding variable binding [1]. 
However, little progress has been made in the direction of 
modularity: it is still difficult to separately develop the def- 
initions and meta-theory of language fragments and then 
link the fragments together to obtain the definitions and 
meta-theory for a language composed of such fragments. 

Dependent types have formed the foundation of a broad 
and rich range of type systems that allow values and types 
to be freely mixed. Programmers can express propositions 
as types viewed as sets, and proofs as objects viewed as 
inhabitants of those sets. This style of theorem proving sug- 
gests the use of familiar engineering abstractions as general 
solutions to questions about theorem proving. Rather than
relying on semi-automated proof search such as Coq's Ltac,
we propose a method of proof composition using simple
abstractions whereby components are defined piecewise and
"tied" together at the end using a wrapper datatype acting
as a tagged union.

The method of language definition used is iterative.
Components are defined separately from one another and are
composable along with their proofs. Thus we would like
separate language designers to be able to reuse one
another's work without the need for sophisticated proof
search algorithms or effort spent copying and pasting terms.

The language we present is one of simple expressions, using
Agda as the implementation language and proof assistant.
We begin by defining a series of language syntaxes for sums,
options, and arrays. We chose to include arrays because they
can result in runtime errors, requiring the inclusion of the
Option type, and because, like addition, they use the natural
numbers, forcing consideration of how value types can be
shared across otherwise isolated components. We continue
by defining evaluation semantics and typing rules. The
language is defined piecewise: each component is built in
isolation alongside a proof of type preservation. We conclude
with a presentation of how these components can be composed
and how a proof of type preservation for the combined
language can be immediately derived from the componentwise
proofs. The motivation for our technique is drawn from
a solution to the expression problem where languages are
defined as the disjoint sum of smaller languages by removing
explicit recursion. We show that this idea can be recast
from types and terms to proofs.



2. A Review of the Expression Problem 

When modeling a problem with a functional flavor, the
natural solution often emerges as several recursive cases handled
by some helper functions. The expression problem states
that this type of solution presents us with a choice: we may 
ordain our data structure forever unchanging, making it easy 
to add new functions without changing the program; or we 
may leave our data structure open, making it difficult to 
extend the original program with new functions. 

While many solutions to the expression problem have
been proposed over the years, here we make use of the
method described by Malcolm [2], which generalizes recursion
operators such as fold from lists to polynomial types. The
problem we encounter arises as a result of algebraic data
types being closed: once the type has been declared, no new
constructors for the type may be added without amending
the original declaration. The solution presented lies at the
heart of our work. The idea is simply to remove immediate
recursion and split a monolithic datatype into components
to be later collected under the umbrella of a tagged union.

Throughout this paper we will work with a simple eval- 
uator over natural numbers and basic arithmetic operators; 
in Agda we might first consider 
data Expr+ : Set where
  atom : ℕ → Expr+
  _+_  : Expr+ → Expr+ → Expr+

This definition has the advantage of being direct and simple;
however, a problem lies within the explicit recursion. Notice
that when later extending expressions with arrays and option
types we can make no reuse of Expr+ due to the closed
nature of algebraic data types. To extend Expr+ we must
define a whole new data type, as in the following definition
of MonolithicExpr.

data MonolithicExpr : Set where
  atom    : ℕ → MonolithicExpr
  esome   : MonolithicExpr → MonolithicExpr
  enone   : MonolithicExpr
  nil[]   : MonolithicExpr
  _!!_    : MonolithicExpr → MonolithicExpr
          → MonolithicExpr
  _[_]:=_ : MonolithicExpr → MonolithicExpr
          → MonolithicExpr → MonolithicExpr
  _+_     : MonolithicExpr → MonolithicExpr
          → MonolithicExpr

fromExpr+ : Expr+ → MonolithicExpr
fromExpr+ (atom n) = atom n
fromExpr+ (n + m)  = fromExpr+ n + fromExpr+ m

Suppose instead we begin with polymorphic definitions such 
as the following. 

data Expr+₂ (A : Set) : Set where
  _+_ : A → A → Expr+₂ A

data Expr[]₂ (A : Set) : Set where
  nil[]   : Expr[]₂ A
  _!!_    : A → A → Expr[]₂ A
  _[_]:=_ : A → A → A → Expr[]₂ A

data ExprOption (A : Set) : Set where
  esome : A → ExprOption A
  enone : ExprOption A

data Lit (A : Set) : Set where
  atom : ℕ → Lit A



We then introduce recursion as follows, combining components
as a disjoint sum, written _⊎_ in Agda.

data RecExpr : Set where
  expr : Lit RecExpr
       ⊎ Expr+₂ RecExpr
       ⊎ Expr[]₂ RecExpr
       ⊎ ExprOption RecExpr
       → RecExpr

More generally, this type of data can be captured using a 
"categorical approach" where recursion is introduced as the 
fixed point of a functor: 

data μ_ (F : Set → Set) : Set where
  inn : F (μ F) → μ F

Expr' = λ (A : Set) → Lit A ⊎ Expr+₂ A
                    ⊎ Expr[]₂ A
                    ⊎ ExprOption A
Expr = μ Expr'



It is easy to see that this new type is equivalent to
MonolithicExpr up to isomorphism:

Expr = μ Expr'
     = Expr' (μ Expr')
     = Expr' Expr
     = atom
     | esome | enone
     | Expr + Expr
     | nil[] | _!!_ | _[_]:=_

2.1 Functors and Agda 

The functor F, passed into μ_ above, serves as the key
abstraction allowing us to represent expressions as least fixed
points. Functors are a special mapping defined over both
types and functions satisfying the so-called functor laws; a
functor F

1. assigns to each type A, a type F A

2. assigns to each function f : A → B, a function map f :
F A → F B

such that

1. identity is preserved: map id = id, and

2. when f ∘ g is defined: map (f ∘ g) = map f ∘ map g.
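
As a concrete check of the two laws, the following is a minimal Agda sketch for an option-like functor. The names Opt, mapOpt, map-id, map-∘ and example are ours and purely illustrative, and the Agda standard library is assumed for id, _∘_, _≡_ and ℕ. Since function extensionality is not available in plain Agda, both laws are stated pointwise rather than as equations between functions.

```agda
open import Data.Nat using (ℕ; suc)
open import Function using (id; _∘_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Illustrative type: a value that may be absent.
data Opt (A : Set) : Set where
  some : A → Opt A
  none : Opt A

-- The action of the functor on functions.
mapOpt : {A B : Set} → (A → B) → Opt A → Opt B
mapOpt f (some x) = some (f x)
mapOpt f none     = none

-- Law 1 (pointwise): mapping the identity changes nothing.
map-id : {A : Set} (o : Opt A) → mapOpt id o ≡ o
map-id (some x) = refl
map-id none     = refl

-- Law 2 (pointwise): mapping a composition is the composition of maps.
map-∘ : {A B C : Set} (f : B → C) (g : A → B) (o : Opt A)
      → mapOpt (f ∘ g) o ≡ (mapOpt f ∘ mapOpt g) o
map-∘ f g (some x) = refl
map-∘ f g none     = refl

-- A small usage example, checked by computation.
example : mapOpt suc (some 1) ≡ some 2
example = refl
```

Both proofs go through by case analysis alone because mapOpt is not recursive; for a recursive functor such as List the same statements would need an inductive argument.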

One familiar example is the List functor mapping each type
A to List A and each function f : A → B to map f :
List A → List B, which applies f to each element of a
list. Here we define the least fixed point over a restricted
class of functors called the polynomial functors. Polynomial
functors are a subset roughly equivalent to the more familiar
algebraic polynomials,

C_0 + C_1 × X + ... + C_n × X^n

where addition is disjoint sum and multiplication is cartesian
product. In Agda, Ulf Norell [4] expresses this class as a
datatype Functor along with an interpretation as a set ⟦_⟧

infixl 6 _⊕_
infixr 7 _⊗_

data Functor : Set₁ where
  X   : Functor
  Δ   : Set → Functor
  _⊕_ : Functor → Functor → Functor
  _⊗_ : Functor → Functor → Functor

⟦_⟧ : Functor → Set → Set
⟦ X ⟧     B = B
⟦ Δ C ⟧   B = C
⟦ F ⊕ G ⟧ B = ⟦ F ⟧ B ⊎ ⟦ G ⟧ B
⟦ F ⊗ G ⟧ B = ⟦ F ⟧ B × ⟦ G ⟧ B

with least fixed point

data μ_ (F : Functor) : Set where
  inn : ⟦ F ⟧ (μ F) → μ F

Then to reexpress Expr as a polynomial functor we use sum
_⊕_ to define cases within a type, and product _⊗_ to
represent arguments of a particular case

Option₁ : Functor
Option₁ = X ⊕ Δ ⊤
Array₁ : Functor
Array₁ = X ⊗ X ⊗ X ⊕ Δ ⊤ ⊕ X ⊗ X
Sum₁ : Functor
Sum₁ = X ⊗ X
F₁ : Functor
F₁ = Δ ℕ ⊕ Option₁ ⊕ Sum₁ ⊕ Array₁
E₁ : Set
E₁ = μ F₁

Unfolding E₁ yields the same value calculated above, as we
should hope!

E₁ = μ F₁
   = ⟦Δ ℕ ⊕ Option₁ ⊕ Sum₁ ⊕ Array₁⟧ (μ F₁)
   = ⟦Δ ℕ⟧ (μ F₁)
     ⊎ ⟦Option₁⟧ (μ F₁)
     ⊎ ⟦Sum₁⟧ (μ F₁)
     ⊎ ⟦Array₁⟧ (μ F₁)
   = ℕ
     ⊎ (μ F₁) ⊎ ⊤
     ⊎ (μ F₁) × (μ F₁)
     ⊎ (μ F₁) × (μ F₁) × (μ F₁) ⊎ ⊤ ⊎ (μ F₁) × (μ F₁)

What do values in E₁ look like? Written directly they appear
nonsensical; consider 6 + 7

the-sum : E₁
the-sum = inn (inj₁ (inj₂ (
    (inn (inj₁ (inj₁ (inj₁ 6))))
  , (inn (inj₁ (inj₁ (inj₁ 7)))))))

Notice here the role that the injections and inn functions
play. Traditionally we would provide a unique name for each
branch in an algebraic datatype; however, here we only have
two names, inj₁ and inj₂, so we instead rely on nesting to
create unique prefixes. Once we have tagged a value we must
give it a well-known type so that parent expressions can
expect a common child type; this is the role of inn. Although
cumbersome, we can hide much of this complexity provided
the right abstractions

the-sum' : E₁
the-sum' = nat₁ 6 +₁ nat₁ 7
  where nat₁ : ℕ → E₁
        nat₁ = inn ∘ inj₁ ∘ inj₁ ∘ inj₁

        _+₁_ : E₁ → E₁ → E₁
        e₁ +₁ e₂ = inn (inj₁ (inj₂ (e₁ , e₂)))
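
For readers who want to type-check the encoding, the definitions of this section can be collected into one self-contained file. This is a sketch under assumptions: it relies on the Agda standard library for ℕ, ⊤, _⊎_ and _×_, it writes the constant functor as K (our name), it drops the trailing underscore from the fixed-point operator, and the final equation `same` is our addition, confirming by computation that the two spellings of 6 + 7 coincide.

```agda
open import Data.Nat using (ℕ)
open import Data.Unit using (⊤; tt)
open import Data.Sum using (_⊎_; inj₁; inj₂)
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

infixl 6 _⊕_
infixr 7 _⊗_

-- Codes for polynomial functors; K is our name for the constant functor.
data Functor : Set₁ where
  X   : Functor
  K   : Set → Functor
  _⊕_ : Functor → Functor → Functor
  _⊗_ : Functor → Functor → Functor

-- Interpretation of a code as a mapping on sets.
⟦_⟧ : Functor → Set → Set
⟦ X ⟧     B = B
⟦ K C ⟧   B = C
⟦ F ⊕ G ⟧ B = ⟦ F ⟧ B ⊎ ⟦ G ⟧ B
⟦ F ⊗ G ⟧ B = ⟦ F ⟧ B × ⟦ G ⟧ B

-- Least fixed point of a polynomial functor.
data μ (F : Functor) : Set where
  inn : ⟦ F ⟧ (μ F) → μ F

Option₁ Array₁ Sum₁ F₁ : Functor
Option₁ = X ⊕ K ⊤
Array₁  = X ⊗ X ⊗ X ⊕ K ⊤ ⊕ X ⊗ X
Sum₁    = X ⊗ X
F₁      = K ℕ ⊕ Option₁ ⊕ Sum₁ ⊕ Array₁

E₁ : Set
E₁ = μ F₁

-- Smart constructors hiding the nest of injections.
nat₁ : ℕ → E₁
nat₁ n = inn (inj₁ (inj₁ (inj₁ n)))

_+₁_ : E₁ → E₁ → E₁
e₁ +₁ e₂ = inn (inj₁ (inj₂ (e₁ , e₂)))

-- 6 + 7 written directly ...
the-sum : E₁
the-sum = inn (inj₁ (inj₂ ( inn (inj₁ (inj₁ (inj₁ 6)))
                          , inn (inj₁ (inj₁ (inj₁ 7))))))

-- ... and through the smart constructors.
the-sum' : E₁
the-sum' = nat₁ 6 +₁ nat₁ 7

-- Both spellings denote the same value.
same : the-sum' ≡ the-sum
same = refl
```

Because _⊕_ is declared left associative, the first summand K ℕ sits under three inj₁ tags, which is why nat₁ needs three injections while _+₁_ needs only two.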



3. Syntax and Evaluation Semantics 

We are now ready to define a simple language and its 
operational semantics. The language is small including just 
sums, an option type, and an array with assignment and 
lookup. In Agda, the unit type is written ⊤ and has only
one member: tt. ⊤ is used to represent constructors that
take no arguments such as nil, the empty list.

Option : Functor
Option = X ⊕ Δ ⊤

Array : Functor
Array = X ⊗ X ⊗ X ⊕ Δ ⊤ ⊕ X ⊗ X

Sum : Functor
Sum = X ⊗ X

FExpr : Functor
FExpr = Δ ℕ ⊕ Option ⊕ Sum ⊕ Array

Expr : Set
Expr = μ FExpr

What does each of these definitions mean? The option type
has two constructors: some, which wraps a single expression;
and none, taking no arguments. We define more descriptive
constructors for tagging these two types of values

none₁ : Expr
none₁ = inn (inj₁ (inj₁ (inj₂ (inj₂ tt))))

some₁ : Expr → Expr
some₁ = inn ∘ inj₁ ∘ inj₁ ∘ inj₂ ∘ inj₁

Giving a convenient constructor for _+_ is similarly
straightforward

enat : ℕ → Expr
enat = inn ∘ inj₁ ∘ inj₁ ∘ inj₁

_+_ : Expr → Expr → Expr
e₁ + e₂ = inn (inj₁ (inj₂ (e₁ , e₂)))

and to define arrays we have assignment taking an array, an
index, and a value to assign at that index; nil, the empty
array; and lookup which accepts an array and an index

_[_]:=₁_ : Expr → Expr → Expr → Expr
a [ i ]:=₁ e = inn (inj₂ (inj₁ (inj₁ (a , i , e))))

nil₁ : Expr
nil₁ = inn (inj₂ (inj₁ (inj₂ tt)))

_!₁_ : Expr → Expr → Expr
a !₁ i = inn (inj₂ (inj₂ (a , i)))

So far the definition of our syntax has used fairly standard
techniques, but we have failed to give any sort of meaning to
these expressions. We first define a monolithic static and
dynamic semantics for this language, then show how to
modularize their definition later in this section. Figure 1c
defines a simple set of typing rules using metavariables e to
range over expressions and n to range over values; Figure 1b
gives a small-step operational semantics.

While Agda is expressive enough to implement these
rules directly, and indeed they are nearly a direct reflection
of that implementation, recall that our goal is to create
several independent languages each carrying their own
semantics. We begin by defining monolithic semantics for
Expr and proceed to determine points of failure and to dissect
the definition into independent constituents. To simplify
things we define our notion of Type as a closed ADT

data Type : Set where
  TArray  : Type
  TOption : Type
  TNat    : Type

and here is the definition of the monolithic type system and 
evaluation relation in Agda. 

data Welltyped : Expr → Type → Set₁ where
  ok-value  : {n : ℕ} → Welltyped (enat n) TNat
  ok-sum    : {e₁ e₂ : Expr}
            → Welltyped e₁ TNat → Welltyped e₂ TNat
            → Welltyped (e₁ + e₂) TNat
  ok-nil    : Welltyped nil₁ TArray
  ok-lookup : {a e : Expr}
            → Welltyped a TArray
            → Welltyped e TNat
            → Welltyped (a !₁ e) TOption
  ok-ins    : {a e n : Expr}
            → Welltyped a TArray
            → Welltyped e TNat
            → Welltyped n TNat
            → Welltyped (a [ n ]:=₁ e) TArray

[Figure 1: (a) Syntax; (b) Evaluation Semantics; (c) Value Typing; (d) Sum Typing; (e) Array Typing]

infix 2 _⟶E_

data _⟶E_ : Expr → Expr → Set where
  stepl  : {e₁ e₁' e₂ : Expr}
         → e₁ ⟶E e₁'
         → e₁ + e₂ ⟶E e₁' + e₂
  stepr  : {n₁ : ℕ} {e₂ e₂' : Expr}
         → e₂ ⟶E e₂'
         → enat n₁ + e₂ ⟶E enat n₁ + e₂'
  sum    : {n₁ n₂ : ℕ}
         → enat n₁ + enat n₂ ⟶E enat (n₁ +ℕ n₂)
  stepi  : {e e' a : Expr}
         → e ⟶E e'
         → a !₁ e ⟶E a !₁ e'
  lookup : {a n : Expr}
         → a !₁ n ⟶E L[ a , n ]₁

The function L[_,_]₁ is the lookup function that evaluates
to some a_n when a_n has been defined and none otherwise.



Notice that we currently do not restrict the values of n
enough in the ok-ins rule; our typing rules require that n
be a value while in Agda we have only required it be an
expression. Some notion of value is needed, and a common
solution is to add a tag Value to the Expr type and pattern
match; here Value is called ⟦Δ ℕ⟧, and in a dependently
typed context we might then define a predicate over Value.
However, because the sum type has only one type of value,
a number, it is simpler to use enat directly.

This method for defining semantics is common with the 
advantage of being direct and concise, but similar to our 
first implementation of Expr+ and MonolithicExpr above: 
there is no simple mechanism for code reuse. The answer is 
again to delay recursion. 



3.1 Dissecting the Step Relation 

In order to modularize the evaluation rules we define a
separate step relation for each functor making up our Expr
type. First note that _+_ doesn't make use of how the step
from e₁ to e₂ occurs, so we can factor this top-level relation

data _⟶+_ {_⟶_ : Expr → Expr → Set}
          : Expr → Expr → Set where
  stepl : {e₁ e₁' e₂ : Expr}
        → e₁ ⟶ e₁' → e₁ + e₂ ⟶+ e₁' + e₂
  stepr : {n₁ : ℕ} {e₂ e₂' : Expr}
        → e₂ ⟶ e₂'
        → enat n₁ + e₂ ⟶+ enat n₁ + e₂'
  sum   : {n₁ n₂ : ℕ}
        → enat n₁ + enat n₂ ⟶+ enat (n₁ +ℕ n₂)

While this is better, there is still an undesirable reference
to the datatype Expr. Applying the same factorization
here to the underlying functor requires parametrization by
two extra coercion functions; these are the _+_ and enat
functions defined previously. The new names lift+ and liftN
used here are meant to imply that a subtype is being "lifted"
into its supertype

data _⟶+_ {E : Functor} {_⟶_ : μ E → μ E → Set}
          {lift+ : ⟦Sum⟧ (μ E) → μ E} {liftN : ℕ → μ E}
          : μ E → μ E → Set where
  stepl : {e₁ e₁' e₂ : μ E}
        → e₁ ⟶ e₁' → lift+ (e₁ , e₂) ⟶+ lift+ (e₁' , e₂)
  stepr : {n₁ : ℕ} {e₂ e₂' : μ E}
        → e₂ ⟶ e₂'
        → lift+ (liftN n₁ , e₂) ⟶+ lift+ (liftN n₁ , e₂')
  sum   : {n₁ n₂ : ℕ}
        → lift+ (liftN n₁ , liftN n₂) ⟶+ liftN (n₁ +ℕ n₂)

Unfortunately this definition falls short too. When we lift
terms into the expression type μE, Agda "forgets" the
constituents e₁ and e₂; in turn we lose the ability to reason
about these distinct components of the sums on either side
of a step. This later becomes a problem when, for example,
attempting to abstract the welltyping relation.

An intelligent human can peel away lift+ and see that
the terms e₁ and e₂ in _⟶_ and _⟶+_ are the same
because lift+ is injective. However, Agda is unconvinced,
and rightfully so, for it does not require a particularly great
deal of ingenuity to find a counterexample. Consider taking
E = FExpr so that μE = Expr

forgetful-lift+ : ⟦Sum⟧ Expr → Expr
forgetful-lift+ (e₁ , e₂) = enat 0

The problem is that our abstraction is too general. What
we require is a proof that ⟦Sum⟧ (μE) and ℕ are subtypes
of the top-level expression datatype μE. The solution to the
problem is drawn from the notion of a categorical subobject.

We proceed by delaying application of injections and view
the objects as injectable, existential terms. The importance
of this approach is two-fold: first, it allows us to take
inverses of lift functions; second, we retain
the perspective of operating on a single type μE.

4. Lazy Coercions 

A subobject of a type T is a left-invertible function with
codomain T, lift : S ↣ T. Being restricted to polynomial
functors, we know that all our subobjects lift : S ↣ μE will
be some composition of inn, inj₁ and inj₂, so a proof that S
is a subtype of μE is merely a description of which direction
to move at each point in a disjoint sum

infix 3 _Contains_

data _Contains_ : Functor → Functor → Set₁ where
  refl  : {F : Functor}
        → F Contains F
  left  : {A B F : Functor}
        → F Contains A ⊕ B → F Contains A
  right : {A B F : Functor}
        → F Contains A ⊕ B → F Contains B

Now we can define containment on a functor's interpretation
as a set

infix 3 _↣_

data _↣_ : Set → Set → Set₁ where
  inj : {F A : Functor}
      → F Contains A → ⟦A⟧ (μ F) ↣ (μ F)

with conversion functions defined as

upcast : ∀ {F A} → F Contains A → ⟦A⟧ (μ F) → μ F
upcast refl      = inn
upcast (left t)  = upcast t ∘ inj₁
upcast (right t) = upcast t ∘ inj₂

apply : {A B : Set} → (A ↣ B) → A → B
apply (inj t) = upcast t

Recall the two goals we had in mind. We first wished to take
the inverse of a lift function to gain access to its arguments;
in the case of _+_ these were e₁ and e₂. By representing an
injection as a delayed application of a subobject, the
constructor's arguments are stored as a part of the
coercion, so finding left inverses becomes a trivial case of
pattern matching. To delay function application, allowing
Agda to effectively peel away the lift functions, we define
a LazyCoercion datatype from type A to B representing the
intention of coercing an object a ∈ A while treating it at the
type level as B. A lazy coercion is then an injection A ↣ B
along with an object in A

data LazyCoercion : Set → Set₁ where
  inj : {A B : Set} → (A ↣ B) → A → LazyCoercion B

coerce : {B : Set} → LazyCoercion B → B
coerce (inj f e) = apply f e
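
The difference between an eager lift and a lazy coercion can be checked on a cut-down language. The following is a minimal self-contained sketch, not the development itself: it assumes the Agda standard library, writes the constant functor as K, renames the Contains constructor refl to here and the LazyCoercion constructor inj to lazy (to avoid overloading names within one file), makes the target type a parameter of LazyCoercion, and uses a hypothetical two-component functor NatSum. The lemma lazy-injective is our illustration of why left inverses come for free: the payload is an argument of a constructor, so matching on refl recovers it.

```agda
open import Data.Nat using (ℕ)
open import Data.Sum using (_⊎_; inj₁; inj₂)
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

infixl 6 _⊕_
infixr 7 _⊗_

data Functor : Set₁ where
  X   : Functor
  K   : Set → Functor
  _⊕_ : Functor → Functor → Functor
  _⊗_ : Functor → Functor → Functor

⟦_⟧ : Functor → Set → Set
⟦ X ⟧     B = B
⟦ K C ⟧   B = C
⟦ F ⊕ G ⟧ B = ⟦ F ⟧ B ⊎ ⟦ G ⟧ B
⟦ F ⊗ G ⟧ B = ⟦ F ⟧ B × ⟦ G ⟧ B

data μ (F : Functor) : Set where
  inn : ⟦ F ⟧ (μ F) → μ F

-- A path into a nested sum; `here` plays the role of refl in the text.
data _Contains_ : Functor → Functor → Set₁ where
  here  : ∀ {F} → F Contains F
  left  : ∀ {A B F} → F Contains (A ⊕ B) → F Contains A
  right : ∀ {A B F} → F Contains (A ⊕ B) → F Contains B

data _↣_ : Set → Set → Set₁ where
  inj : ∀ {F A} → F Contains A → ⟦ A ⟧ (μ F) ↣ μ F

-- Follow the path, wrapping the payload in the matching injections.
upcast : ∀ {F A} → F Contains A → ⟦ A ⟧ (μ F) → μ F
upcast here      x = inn x
upcast (left t)  x = upcast t (inj₁ x)
upcast (right t) x = upcast t (inj₂ x)

apply : {A B : Set} → (A ↣ B) → A → B
apply (inj t) = upcast t

-- The payload is stored next to the injection instead of being consumed.
data LazyCoercion (B : Set) : Set₁ where
  lazy : {A : Set} → (A ↣ B) → A → LazyCoercion B

coerce : {B : Set} → LazyCoercion B → B
coerce (lazy f e) = apply f e

-- Hypothetical cut-down language: naturals and sums only.
Sum NatSum : Functor
Sum    = X ⊗ X
NatSum = K ℕ ⊕ Sum

liftN : ℕ ↣ μ NatSum
liftN = inj (left here)

lift+ : ⟦ Sum ⟧ (μ NatSum) ↣ μ NatSum
lift+ = inj (right here)

-- Forcing a lazy coercion yields the usual nest of injections.
forced : coerce (lazy lift+ (apply liftN 6 , apply liftN 7))
       ≡ inn (inj₂ (inn (inj₁ 6) , inn (inj₁ 7)))
forced = refl

-- Constructors are injective, so the payload is recovered by matching.
lazy-injective : {A B : Set} {f : A ↣ B} {a b : A}
               → lazy f a ≡ lazy f b → a ≡ b
lazy-injective refl = refl
```

No analogue of lazy-injective can be stated for an arbitrary function of type ⟦ Sum ⟧ (μ NatSum) → μ NatSum, which is exactly the gap that forgetful-lift+ exploits.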

Our second goal was to operate on objects of a single type.
Why is this the case? Recall that the type of our step relation
is indexed by two expressions: (e₁ : Expr) ⟶E (e₂ : Expr).
We should expect the same of the final abstraction over step
relations because it cannot easily name the underlying type
of its indexing expressions. Instead we have packaged the
indices as existentials which are viewed as the type B.

We seem to be close to a modular step relation _⟶+_,
defining at each point another level of abstraction to delay
immediate application. To modularize datatypes, recursion
is delayed and types are viewed as polynomial functors;
then, to modularize step relations, evaluation is parametrized
and expression upcasts are delayed by viewing them as an
intention.

5. Defining a Modular Step Relation 

Attempting again to define a step relation for addition we
find very little has changed

data _⟶+_ {E : Functor}
          {_⟶_ : μ E → μ E → Set₁}
          {lift+ : ⟦Sum⟧ (μ E) ↣ μ E}
          {liftN : ℕ ↣ μ E}
          : LazyCoercion (μ E) → LazyCoercion (μ E) → Set₁
  where
  stepl : {e₁ e₁' e₂ : μ E}
        → e₁ ⟶ e₁'
        → inj lift+ (e₁ , e₂) ⟶+ inj lift+ (e₁' , e₂)
  stepr : {e₁ e₂ e₂' : μ E}
        → e₂ ⟶ e₂'
        → inj lift+ (e₁ , e₂) ⟶+ inj lift+ (e₁ , e₂')
  stepv : {n m : ℕ}
        → inj lift+ (apply liftN n , apply liftN m)
          ⟶+ inj liftN (n +ℕ m)

It appears we've littered an otherwise simple definition with
inj, but we've replaced our arbitrary arrows with objects
having constructors we can match on. Using the above
techniques we can modularize the welltyping relation over
sums for free

data WtSum {E : Functor}
           {Wt : μ E → Type → Set₁}
           {lift+ : ⟦Sum⟧ (μ E) ↣ μ E}
           : LazyCoercion (μ E) → Type → Set₁ where
  ok-sum : {e₁ e₂ : μ E}
         → Wt e₁ TNat → Wt e₂ TNat
         → WtSum (inj lift+ (e₁ , e₂)) TNat

The above definitions nearly wrote themselves. The simplic- 
ity comes from the fact we are just abstracting as many 
terms as possible, keeping in mind we can fill them in nat- 
urally later because the abstraction is so general there are 
few options available. 

5.1 Arrays 

We proceed by defining the step and welltypedness relations
on arrays that can be combined with the relations on sums.
The definitions for evaluation and welltypedness should look
similar to those for sums, _⟶+_.

data _⟶[]_ {E : Functor}
           {_⟶_ : μ E → μ E → Set₁}
           {liftA : ⟦Array⟧ (μ E) ↣ μ E}
           {liftN : ℕ ↣ μ E}
           {liftO : ⟦Option⟧ (μ E) ↣ μ E}
           : LazyCoercion (μ E) → LazyCoercion (μ E) → Set₁
  where
  stepi  : {e e' a : μ E} → e ⟶ e'
         → inj liftA (a ! e) ⟶[] inj liftA (a ! e')
  lookup : {a : ⟦Array⟧ (μ E)} {n : ℕ}
         → inj liftA (apply liftA a ! apply liftN n)
           ⟶[] inj liftO L[ a , n ]

To define the typing relation we again follow the format of
WtSum above and we are done.

data WtArray {E : Functor}
             {Wt : μ E → Type → Set₁}
             {liftA : ⟦Array⟧ (μ E) ↣ μ E}
             {liftN : ℕ ↣ μ E}
             : LazyCoercion (μ E) → Type → Set₁ where
  ok-nil    : WtArray (inj liftA nil) TArray
  ok-ins    : {a e n : μ E}
            → Wt a TArray → Wt e TNat → Wt n TNat
            → WtArray (inj liftA (a [ n ]:= e)) TArray
  ok-lookup : {e a : μ E}
            → Wt a TArray → Wt e TNat
            → WtArray (inj liftA (a ! e)) TOption



6. Proving Type Preservation 

The type preservation lemma states that if a term is welltyped
and can step, then the type of the term is preserved
after evaluation

e ⟶ e' ∧ e : τ ⟹ e' : τ    (type-preservation)

Prior to considering how type preservation might look for
each of the previously defined components we should review
what type preservation looks like for the MonolithicExpr
language. The proof is standard, proceeding by structural
induction on the shape of the welltyping tree.

preservation-MonolithicExpr : ∀ {e e'} {τ}
  → e ⟶E e'
  → WtMonolithicExpr e τ
  → WtMonolithicExpr e' τ
preservation-MonolithicExpr (stepl ste₁) (ok-sum wte₁ wte₂)
  = ok-sum (preservation-MonolithicExpr ste₁ wte₁) wte₂
preservation-MonolithicExpr (stepr ste₂) (ok-sum wte₁ wte₂)
  = ok-sum wte₁ (preservation-MonolithicExpr ste₂ wte₂)
preservation-MonolithicExpr
  (stepv {n} {m}) (ok-sum wtn wtm)
  = ok-nat (n +ℕ m)
preservation-MonolithicExpr (stepi ste) (ok-lookup wta wte)
  = ok-lookup wta (preservation-MonolithicExpr ste wte)
preservation-MonolithicExpr
  (lookup {a} {n}) (ok-lookup wta wtn)
  = proj₂ L[ a , n ]

There are three items worth noting here. The first is the
use of the function L[_,_] : MonolithicExpr → ℕ →
∃ e. WtMonolithicExpr e TOption, which we have assumed
produces a pair with first component an expression and
second component a proof that the expression is a welltyped
option. The second is that recursion acts as our induction
hypothesis. Finally, Agda is smart enough to notice
there is only a single possible welltyping constructor for each
step constructor; in Agda all functions are total.

We should expect the modular type preservation lemmas
to look similar because there is little global knowledge
involved. The induction hypothesis and values aside, each case
is "contained within its own world" in the sense that each
evaluation rule relies only on the fact that subterms are
welltyped while ignoring the reason they are welltyped. To show
type preservation for sums we might start with

preservation-Sum₁ : {τ : Type} {E : Functor}
  {e e' : LazyCoercion (μ E)}
  → e ⟶+ e'
  → WtSum e τ
  → WtSum e' τ
preservation-Sum₁
  (stepl {e₁} {e₁'} {e₂} ste₁) (ok-sum wte₁ wte₂) = *

However, recall that _⟶+_ requires the top-level step
relation and proof that E contains both sums and naturals.
There is a second mistake in writing preservation this way:
we would like to show that e' is welltyped in the expression
language, not just in the modular sum language.
This reflects our desire to expose as little about each
component as possible. A second formulation might then begin as
follows, but we again fail.

preservation-Sum₂ : {τ : Type}
  {E : Functor}
  {_⟶_ : μ E → μ E → Set₁}
  {lift+ : ⟦Sum⟧ (μ E) ↣ μ E}
  {liftN : ℕ ↣ μ E}
  {Wt : μ E → Type → Set₁}
  {e e' : LazyCoercion (μ E)}
  → _⟶+_ {E} {_⟶_} {lift+} {liftN} e e'
  → WtSum {E} {Wt} {lift+} e τ
  → Wt (coerce e') τ
preservation-Sum₂ (stepl ste₁) (ok-sum wte₁ wte₂)
  = * (ok-sum * wte₂)
preservation-Sum₂ (stepr ste₁) (ok-sum wte₁ wte₂)
  = * (ok-sum wte₁ *)
preservation-Sum₂ stepv (ok-sum wte₁ wte₂)
  = * (n +ℕ m)

It seems we're only missing two pieces: we need to be able to
lift welltyped sums and naturals into Wt; and we need some
way of expressing the induction hypothesis, which states that
because e₁ is welltyped and stepped, e₁' is welltyped too.
The induction hypothesis is slightly stranger than was the
case for MonolithicExpr because we know e₁ and e₁'
are welltyped despite the fact that they are arbitrary
expressions, not necessarily just sums. This motivates our
solution, which takes the induction hypothesis as an explicit
assumption.

preservation-Sum
  : {E : Functor} {τ : Type}
    {_⟶_ : μ E → μ E → Set₁}
    {lift+ : ⟦Sum⟧ (μ E) ↣ μ E}
    {liftN : ℕ ↣ μ E}
    {Wt : μ E → Type → Set₁}
    {a b : LazyCoercion (μ E)}
  → ((n : ℕ) → Wt (apply liftN n) TNat)
  → (∀ {δ} {e}
      → WtSum {E} {Wt} {lift+} (inj lift+ e) δ
      → Wt (apply lift+ e) δ)
  → (∀ {δ} {e e'} → e ⟶ e' → Wt e δ → Wt e' δ)
  → _⟶+_ {E} {_⟶_} {lift+} {liftN} a b
  → WtSum {E} {Wt} {lift+} a τ
  → Wt (coerce b) τ
preservation-Sum wtnat wt IH
  (stepl ste₁) (ok-sum wte₁ wte₂)
  = wt (ok-sum (IH ste₁ wte₁) wte₂)
preservation-Sum wtnat wt IH
  (stepr ste₂) (ok-sum wte₁ wte₂)
  = wt (ok-sum wte₁ (IH ste₂ wte₂))
preservation-Sum wtnat wt IH
  (stepv {n} {m}) (ok-sum wte₁ wte₂)
  = wtnat (n +ℕ m)

We are pleased with how similar this is to the original, 
monolithic formulation. Notice again that the solution was 
to factor out assumptions about the outside world similar 
to the previous abstractions. Proving type preservation for 
arrays is similarly natural: 



preservation-Array
  : {E : Functor} {τ : Type}
    {_⟶_ : μ E → μ E → Set₁}
    {liftA : ⟦Array⟧ (μ E) ↣ μ E}
    {liftN : ℕ ↣ μ E}
    {liftO : ⟦Option⟧ (μ E) ↣ μ E}
    {Wt : μ E → Type → Set₁}
    {a b : LazyCoercion (μ E)}
  → ((m : ⟦Option⟧ (μ E)) → Wt (apply liftO m) TOption)
  → (∀ {δ} {e}
      → WtArray {E} {Wt} {liftA} {liftN} (inj liftA e) δ
      → Wt (apply liftA e) δ)
  → (∀ {δ} {e e'} → e ⟶ e' → Wt e δ → Wt e' δ)
  → _⟶[]_ {E} {_⟶_} {liftA} {liftN} {liftO} a b
  → WtArray {E} {Wt} {liftA} {liftN} a τ
  → Wt (coerce b) τ
preservation-Array wtopt wt IH
  (stepi ste) (ok-lookup wta wte)
  = wt (ok-lookup wta (IH ste wte))
preservation-Array wtopt wt IH
  (lookup {a} {n}) (ok-lookup wta wte)
  = wtopt L[ a , n ]

It would seem we're nearly done and the final pieces should 
be entirely guided by the selected abstractions. The lift 
functions each have a unique solution: 

lift+ : ⟦Sum⟧ Expr ↣ Expr
lift+ = inj (right (left refl))

liftN : ℕ ↣ Expr
liftN = inj (left (left (left refl)))

liftO : ⟦Option⟧ Expr ↣ Expr
liftO = inj (right (left (left refl)))

liftA : ⟦Array⟧ Expr ↣ Expr
liftA = inj (right refl)

But how should we define welltypedness for Expr? Again the 
notion of what it means to be welltyped has already been 
defined and we simply need to "tie the knot" as RecExpr did 
above 

data WtExpr : Expr → Type → Set₁ where
  lift-wt-nat    : (n : ℕ) → WtExpr (apply liftN n) TNat
  lift-wt-option : (m : ⟦Option⟧ Expr)
                 → WtExpr (apply liftO m) TOption
  lift-wt-sum    : {τ : Type} {e : ⟦Sum⟧ Expr}
                 → WtSum {FExpr} {WtExpr} {lift+} (inj lift+ e) τ
                 → WtExpr (apply lift+ e) τ
  lift-wt-array  : {τ : Type} {e : ⟦Array⟧ Expr}
                 → WtArray {FExpr} {WtExpr} {liftA} {liftN}
                     (inj liftA e) τ
                 → WtExpr (apply liftA e) τ

To define a step relation on Expr, _⟶_, we provide a
similar wrapping for each language component

data _⟶_ : Expr → Expr → Set₁ where
  step+  : {e : ⟦Sum⟧ Expr} {e' : LazyCoercion Expr}
         → _⟶+_ {FExpr} {_⟶_} {lift+} {liftN}
             (inj lift+ e) e'
         → apply lift+ e ⟶ coerce e'
  step[] : {e : ⟦Array⟧ Expr} {e' : LazyCoercion Expr}
         → _⟶[]_ {FExpr} {_⟶_} {liftA} {liftN} {liftO}
             (inj liftA e) e'
         → apply liftA e ⟶ coerce e'
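
The wrapping can be exercised end to end on a cut-down language. The following is a minimal self-contained sketch under stated assumptions, not the full development: it covers only naturals and sums (the functor NatSum is hypothetical), assumes the Agda standard library, writes the constant functor as K, and renames the Contains constructor refl to here and the LazyCoercion constructor inj to lazy. No preservation proof is attempted; the two closing definitions are our examples of derivations built from the modular rules.

```agda
open import Data.Nat using (ℕ; _+_)
open import Data.Sum using (_⊎_; inj₁; inj₂)
open import Data.Product using (_×_; _,_)

infixl 6 _⊕_
infixr 7 _⊗_

data Functor : Set₁ where
  X   : Functor
  K   : Set → Functor
  _⊕_ : Functor → Functor → Functor
  _⊗_ : Functor → Functor → Functor

⟦_⟧ : Functor → Set → Set
⟦ X ⟧     B = B
⟦ K C ⟧   B = C
⟦ F ⊕ G ⟧ B = ⟦ F ⟧ B ⊎ ⟦ G ⟧ B
⟦ F ⊗ G ⟧ B = ⟦ F ⟧ B × ⟦ G ⟧ B

data μ (F : Functor) : Set where
  inn : ⟦ F ⟧ (μ F) → μ F

data _Contains_ : Functor → Functor → Set₁ where
  here  : ∀ {F} → F Contains F
  left  : ∀ {A B F} → F Contains (A ⊕ B) → F Contains A
  right : ∀ {A B F} → F Contains (A ⊕ B) → F Contains B

data _↣_ : Set → Set → Set₁ where
  inj : ∀ {F A} → F Contains A → ⟦ A ⟧ (μ F) ↣ μ F

upcast : ∀ {F A} → F Contains A → ⟦ A ⟧ (μ F) → μ F
upcast here      x = inn x
upcast (left t)  x = upcast t (inj₁ x)
upcast (right t) x = upcast t (inj₂ x)

apply : {A B : Set} → (A ↣ B) → A → B
apply (inj t) = upcast t

data LazyCoercion (B : Set) : Set₁ where
  lazy : {A : Set} → (A ↣ B) → A → LazyCoercion B

coerce : {B : Set} → LazyCoercion B → B
coerce (lazy f e) = apply f e

Sum : Functor
Sum = X ⊗ X

-- Modular step relation for sums, abstract in the host language E.
data _⟶+_ {E : Functor} {_⟶_ : μ E → μ E → Set₁}
          {lift+ : ⟦ Sum ⟧ (μ E) ↣ μ E} {liftN : ℕ ↣ μ E}
          : LazyCoercion (μ E) → LazyCoercion (μ E) → Set₁ where
  stepl : {e₁ e₁' e₂ : μ E} → e₁ ⟶ e₁'
        → lazy lift+ (e₁ , e₂) ⟶+ lazy lift+ (e₁' , e₂)
  stepr : {e₁ e₂ e₂' : μ E} → e₂ ⟶ e₂'
        → lazy lift+ (e₁ , e₂) ⟶+ lazy lift+ (e₁ , e₂')
  stepv : {n m : ℕ}
        → lazy lift+ (apply liftN n , apply liftN m) ⟶+ lazy liftN (n + m)

-- Hypothetical host language: naturals and sums only.
NatSum : Functor
NatSum = K ℕ ⊕ Sum

Expr : Set
Expr = μ NatSum

liftN : ℕ ↣ Expr
liftN = inj (left here)

lift+ : ⟦ Sum ⟧ Expr ↣ Expr
lift+ = inj (right here)

-- Tying the knot: the host relation is passed to its own component.
data _⟶_ : Expr → Expr → Set₁ where
  step+ : {e : ⟦ Sum ⟧ Expr} {e' : LazyCoercion Expr}
        → _⟶+_ {NatSum} {_⟶_} {lift+} {liftN} (lazy lift+ e) e'
        → apply lift+ e ⟶ coerce e'

nat : ℕ → Expr
nat n = apply liftN n

_+E_ : Expr → Expr → Expr
e₁ +E e₂ = apply lift+ (e₁ , e₂)

-- Two example derivations.
step-3 : (nat 1 +E nat 2) ⟶ nat 3
step-3 = step+ stepv

step-nested : ((nat 1 +E nat 2) +E nat 4) ⟶ (nat 3 +E nat 4)
step-nested = step+ (stepl step-3)
```

The indices of step+ mention apply and coerce, so a derivation only has the stated type after both sides are forced; this is where the lazy coercions are finally collapsed back into plain expressions.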



The only piece remaining is to prove type preservation. We 
begin in the same way we have for each of the previous proofs 
using the step relation's constructors as a guide. The type 
signature should not have changed 

preservation : {e e' : Expr} {τ : Type}
  → e ⟶ e' → WtExpr e τ → WtExpr e' τ

and there are two cases, step+ and step[]; moreover, we
should expect to merely apply preservation-* to each case,
supplying the necessary lift functions and the induction
hypothesis. This is indeed the case:

preservation (step+ ste) (lift-wt-sum wts)
  = preservation-Sum lift-wt-nat lift-wt-sum
      preservation ste wts
preservation (step[] ste) (lift-wt-array wta)
  = preservation-Array lift-wt-option lift-wt-array
      preservation ste wta

Having shown type preservation, it is interesting to see the
similarity between how terms are shown to be welltyped and
to evaluate and how the terms are expressed in μ FExpr.
Recall that each term in Expr is wrapped by a tag, given
by inj₁ and inj₂, and the constructor inn plays the role of
recursion. To reiterate, consider the convenience functions,

nilE : Expr
nilE = inn (inj₂ (inj₁ (inj₂ tt)))

nat : ℕ → Expr
nat n = inn (inj₁ (inj₁ (inj₁ n)))

_[_]=_ : Expr → Expr → Expr → Expr
a [ n ]= e = apply liftA (a [ n ]:= e)

_!E_ : Expr → Expr → Expr
a !E n = apply liftA (a ! n)

_+E_ : Expr → Expr → Expr
e₁ +E e₂ = apply lift+ (e₁ , e₂)

we may then ask: why is the term 

exp : Expr
exp = (nilE [ nat 0 ]= nat 1) !E (nat 0 +E nat 1)

welltyped? The answer given by WtExpr is

wt-exp : WtExpr exp TOption
wt-exp = lift-wt-array (ok-lookup wta wt+)
  where
  wta : WtExpr (nilE [ nat 0 ]= nat 1) TArray
  wta = lift-wt-array
          (ok-ins (lift-wt-array ok-nil)
                  (lift-wt-nat 1) (lift-wt-nat 0))
  wt+ : WtExpr (nat 0 +E nat 1) TNat
  wt+ = lift-wt-sum (ok-sum
          (lift-wt-nat 0) (lift-wt-nat 1))

The lift-wt-* functions play the same role in WtExpr as
inn does in Expr; however, rather than using the generalized
approach of a series of disjoint sums, we bundle the tag
and recursion into a single constructor for each language
component. Evaluation displays a similar symmetry

eval-expr : (nilE [ nat 0 ]= nat 1) !E (nat 0 +E nat 1)
          ⟶ (nilE [ nat 0 ]= nat 1) !E nat 1
eval-expr = step[] (stepi (step+ stepv))

What does the proof that (nilE [ nat 0 ]= nat 1) !E nat 1 is
welltyped look like? We can compute it by invoking

preservation eval-expr wt-exp

which evaluates to

lift-wt-array
  (ok-lookup
    (lift-wt-array
      (ok-ins (lift-wt-array ok-nil) (lift-wt-nat 1) (lift-wt-nat 0)))
    (lift-wt-nat 1))



7. Related Work 

Independently of and concurrently with our work, Delaware et
al. [7] developed a solution to modular meta-theory in Coq.
Both their approach and ours rely on the principle of
representing data types as functors; however, they have chosen to
express inductive types using Church encodings and recursive
evaluation using Mendler algebras, which requires some
extra sophistication. Here we express types as data members
of the family of polynomial functors and apply recursive
evaluation directly. Their approach is further
along and has shown the important level of robustness
required by most languages, while there are more unanswered
questions regarding the method presented here.

8. Conclusion and Future Work 

We should ask if we have accomplished the goal that we 
set out with. The language Expr was given componentwise 
and the boiler-plate necessary to wrap each welltyping and 
step relation is minimal. The proof of type preservation was 
almost immediate, requiring only an invocation of previously 
defined proofs for each component. Moreover, there is no
copy and paste necessary, and the repetitive components
should be automatically producible given a sophisticated
macro system where terms can be inspected by name
(set equality is non-deterministic) rather than value.

Using Agda as a proof language, although convenient,
leaves the question of consistency open. We regard this as
a minor problem and hope that our implementation would
port to Coq. A more pertinent problem is the definition of
preservation for Expr: Agda is unable to prove termination,
and we plan to address this soon.

The language presented is quite simple, unable to express
even Euclid's algorithm, and the method of polynomial
functors used to express Expr precludes the possibility of
first-class function types, which are critical for functional
programming. Various solutions to this problem have been
proposed [5] and the area of recursion schemes is rich [6].
A real-world language calls for much heavier sophistication,
but the ideas presented here are new; their reach is open
to question and requires further exploration.